Lexical Database Design: The Shakespeare Dictionary Model

Author

  • H. Joachim Neuhaus
Abstract

1. The Data

The Shakespeare Dictionary (SHAD) project has been using structured databases since 1983. The system is implemented on a PRIME 250-II computer using standard CODASYL-DBMS software and related tools. The project has been able to draw on a vast repository of computerized material dealing with Shakespeare and the English lexicon. Initially, it was part of the "Sonderforschungsbereich 100 Elektronische Sprachforschung" sponsored on the national level by the Deutsche Forschungsgemeinschaft. The research team has been directed by Marvin Spevack and H. Joachim Neuhaus, now both at Münster, and Thomas Finkenstaedt, now at Augsburg. Spevack's Complete and Systematic Concordance to the Works of Shakespeare (Hildesheim and New York, 1968-1978) and Finkenstaedt's Chronological English Dictionary (Heidelberg, 1970), both in machine-readable form, were used in a computer-assisted lemmatization procedure (Spevack, Neuhaus, and Finkenstaedt 1974). A chronologically arranged dictionary, where entries are sorted according to the year of first occurrence, makes it possible to "stop" the development of the recorded English vocabulary at any desired moment and to compare, for instance, Shakespeare's vocabulary with the corpus of English words recorded up to 1623, when the First Folio appeared (Neuhaus 1978). The set of words in Shakespeare can be compared with the complement set of words available in Elizabethan English but not attested in Shakespeare's works. In this way Shakespeare's vocabulary is systematically integrated into the total vocabulary. As a result, our database model can easily be expanded or transferred to cover larger or different vocabularies.
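The chronological cutoff and the complement-set comparison described above can be sketched in a few lines. This is only an illustration under stated assumptions, not the SHAD implementation (which ran on CODASYL-DBMS software): the names `Entry`, `vocabulary_up_to`, and `unattested` are hypothetical, and the lemmata and years are placeholders, not real dating data.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Entry:
    """One dictionary entry (hypothetical record layout)."""
    lemma: str
    first_year: int  # year of first recorded occurrence


def vocabulary_up_to(entries, year):
    """'Stop' the recorded vocabulary at `year`: keep lemmata first recorded by then."""
    return {e.lemma for e in entries if e.first_year <= year}


def unattested(entries, author_lemmata, year):
    """Complement set: lemmata available by `year` but absent from the author's works."""
    return vocabulary_up_to(entries, year) - set(author_lemmata)


# Placeholder data, invented purely for illustration.
entries = [
    Entry("alpha", 1200),
    Entry("beta", 1590),
    Entry("gamma", 1650),
]
author_lemmata = {"alpha"}

available_1623 = vocabulary_up_to(entries, 1623)
missing_1623 = unattested(entries, author_lemmata, 1623)
```

With the placeholder data, `available_1623` holds the two lemmata dated on or before 1623, and `missing_1623` holds the one of those that the author does not use. Because the comparison depends only on the cutoff year and a set of lemmata, the same functions would apply to a different author or a larger dictionary.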
In order to present the complete Shakespearean vocabulary and to disengage SHAD from dependence on a single edition of Shakespeare, the data were expanded to include all stage directions and speech-prefixes in all quartos up to and including the First Folio (Volume VII of the Complete and Systematic Concordance to the Works of Shakespeare), and the "bad" quartos (Volume VIII). Volume IX presents all substantive variants, producing a composite Shakespearean vocabulary in modern and eventually old spelling. In analysing this material, a strict differentiation between vocabulary level and text level has been observed. Further data preparation on the vocabulary level concentrated on formal properties of Shakespearean lemmata, such as morphological structure or etymological background. There is a complete morphology for all lemmata (ca. 20,000 records), which gives detailed structural descriptions of derivations, compounds, and other combinations, as well as all inflected word-forms as they occur in the text. The etymological data include word histories and loan …


Similar resources

Lexical vs. Dictionary Databases Design Choices of the MorDebe System

Many lexical databases are modelled simply as digital versions of paper dictionaries. However, for many purposes the demands on a lexical database are different from those on a dictionary database. Therefore, the MorDebe database system deviates from the design of dictionary databases in a number of important ways. Firstly, it puts different restrictions on the inclusion of words due to its less...


The Habanera Lexical Knowledge Base Management System

Habanera is a multipurpose multilingual lexical knowledge base that is developed at CRL to be used as a central repository of multilingual lexical data. The knowledge base contains a set of dictionaries and relations between entries, within a dictionary (e.g., synonymy) as well as between entries of different dictionaries (e.g., translation). The format of monolingual lexical entries is left re...


Structural Properties Of Lexical Systems: Monolingual And Multilingual Perspectives

We introduce a new type of lexical structure called lexical system, an interoperable model that can feed both monolingual and multilingual language resources. We begin with a formal characterization of lexical systems as “pure” directed graphs, solely made up of nodes corresponding to lexical entities and links. To illustrate our approach, we present data borrowed from a lexical system that ha...


Navigation: Metaphorical and Real

In preparation for building a dictionary browser, an actual navigational interface (a digital chart plotter) was analyzed for design guidelines to be applied to an interactive semantic navigator. WordNet is a lexical database structured as an inheritance system encoding numerous semantic relationships. The guidelines are applied to the design of a browser for WordNet.


Implementing a Bilingual Lexical Database System

The current state of progress of a research project for the design and development of a bilingual, Italian-English/English-Italian, lexical database system is presented. The aim is to create an integrated system in which a number of monolingual electronic dictionaries and/or lexical databases can be linked through the medium of a bilingual database. In addition, procedures are being implemented...




Publication date: 1986